Abstract

Multivariate, real-valued functions on $\mathbb{R}^d$ induce matrix-valued functions on the
space of $d$-tuples of $n \times n$ pairwise-commuting self-adjoint matrices. We examine the
geometry of this space of matrices and conclude that the best notion of differentiation
of these matrix functions is differentiation along curves. We prove that $C^1$ real-valued
functions induce $C^1$ matrix functions and give a formula for the derivative. We also
show that real-valued $C^m$ functions defined on open rectangles in $\mathbb{R}^2$ induce matrix
functions that can be $m$-times continuously differentiated along $C^m$ curves.

1 Introduction 

Every real-valued function defined on $\mathbb{R}$ induces a matrix-valued function on the space
of $n \times n$ self-adjoint matrices by acting on the spectrum of each matrix. Likewise,
each real-valued function $f$ defined on an open set $\Omega \subseteq \mathbb{R}^d$ induces a matrix-valued
function $F$ on the space of $d$-tuples of $n \times n$ pairwise-commuting self-adjoint matrices
with joint spectrum in $\Omega$. Let $S = (S^1, \dots, S^d)$ be such a $d$-tuple diagonalized by a
unitary matrix $U$ as follows:



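\[ S^r = U \begin{pmatrix} x_1^r & & \\ & \ddots & \\ & & x_n^r \end{pmatrix} U^* \qquad \forall\, 1 \le r \le d. \]

Denote the joint spectrum of $S$ by $\sigma(S) := \{x_i = (x_i^1, \dots, x_i^d) : 1 \le i \le n\}$ and define

\[ F(S) := U \begin{pmatrix} f(x_1) & & \\ & \ddots & \\ & & f(x_n) \end{pmatrix} U^*. \tag{1.1} \]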
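
As an illustration of definition (1.1), the following minimal sketch builds a commuting pair from a rotation and two joint eigenvalues, then evaluates the induced function. The rotation angle, the eigenvalues, the polynomial `f`, and all helper names are assumptions chosen for the example, not part of the paper. For a polynomial, the induced matrix function should agree with the polynomial evaluated directly on the matrices.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def conjugate(U, D):
    """Return U D U^T; U is real orthogonal here, so U^* = U^T."""
    return matmul(matmul(U, D), transpose(U))

def induced(f, U, joint_eigs):
    """F(S) = U diag(f(x_1), ..., f(x_n)) U^* as in (1.1)."""
    return conjugate(U, diag([f(*x) for x in joint_eigs]))

# Hypothetical data: a rotation U and two joint eigenvalues x_1, x_2 in R^2.
theta = 0.7
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
joint_eigs = [(1.0, 3.0), (2.0, -1.0)]
S1 = conjugate(U, diag([x[0] for x in joint_eigs]))
S2 = conjugate(U, diag([x[1] for x in joint_eigs]))

def f(x, y):
    return x * y + x

F_S = induced(f, U, joint_eigs)

# For a polynomial f, F(S) should agree with the polynomial evaluated on the matrices.
S1S2 = matmul(S1, S2)
direct = [[S1S2[i][j] + S1[i][j] for j in range(2)] for i in range(2)]
max_gap = max(abs(F_S[i][j] - direct[i][j]) for i in range(2) for j in range(2))
```

Here `max_gap` measures the entrywise difference between the spectral definition and the direct evaluation $S^1 S^2 + S^1$; it should be at the level of rounding error.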
This paper will show that certain differentiability properties of the original function
pass to the matrix function. Even for a one-variable function, this is nontrivial. Let
$f \in C^1(\mathbb{R},\mathbb{R})$ and consider the simple case of differentiating the associated matrix
function $F$ along a $C^1$ curve $S(t)$ of $n \times n$ self-adjoint matrices. At first glance, it
seems reasonable to write $S(t) = U(t)D(t)U^*(t)$, for $U(t)$ unitary and $D(t)$ diagonal.
Then $F(S(t)) = U(t)F(D(t))U^*(t)$ and we can differentiate using the product rule.

However, there is no guarantee that we can decompose S(t) into its eigenvector and 
eigenvalue matrices so that the eigenvectors are even continuous. As demonstrated 
by the following example from [8], eigenvector behavior at points where distinct 
eigenvalues coalesce can be unpredictable. Specifically, let 

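\[ S(t) = e^{-1/t^2} \begin{pmatrix} \cos(2/t) & \sin(2/t) \\ \sin(2/t) & -\cos(2/t) \end{pmatrix} \ \text{ for } t \neq 0, \quad \text{and} \quad S(0) = 0. \]

For $t \neq 0$, the eigenvalues of $S(t)$ are $\pm e^{-1/t^2}$ and their associated eigenvectors are

\[ \pm \begin{pmatrix} \cos(1/t) \\ \sin(1/t) \end{pmatrix} \quad \text{and} \quad \pm \begin{pmatrix} \sin(1/t) \\ -\cos(1/t) \end{pmatrix}. \]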



Thus, even an infinitely differentiable curve can have singularities in its eigenvectors. 
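
The eigenvector behavior in this example can be checked numerically. The sketch below assumes the standard form of the example, $S(t) = e^{-1/t^2}\begin{pmatrix}\cos(2/t) & \sin(2/t)\\ \sin(2/t) & -\cos(2/t)\end{pmatrix}$; the function names are illustrative. It verifies the eigenvector $(\cos(1/t), \sin(1/t))$ and shows that, along two sequences of $t$ tending to $0$, this eigenvector is aligned with $e_1$ or orthogonal to it.

```python
import math

def S(t):
    """The 2 x 2 curve of the example; S(0) = 0."""
    if t == 0:
        return [[0.0, 0.0], [0.0, 0.0]]
    scale = math.exp(-1.0 / t**2)
    c, s = math.cos(2.0 / t), math.sin(2.0 / t)
    return [[scale * c, scale * s], [scale * s, -scale * c]]

def eigen_residual(t):
    """Largest entry of S(t) v - lambda v for v = (cos(1/t), sin(1/t))."""
    A, lam = S(t), math.exp(-1.0 / t**2)
    v = (math.cos(1.0 / t), math.sin(1.0 / t))
    Av = (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])
    return max(abs(Av[0] - lam * v[0]), abs(Av[1] - lam * v[1]))

def overlap_with_e1(t):
    """|<v(t), e_1>| for the eigenvector v(t) of the positive eigenvalue."""
    return abs(math.cos(1.0 / t))

aligned = overlap_with_e1(1.0 / (10 * math.pi))        # 1/t = 10 pi
orthogonal = overlap_with_e1(1.0 / (10.5 * math.pi))   # 1/t = 10.5 pi
```

Replacing `10` by any larger integer gives the same two values at points closer to $0$, so the eigenvector has no limit as $t \to 0$.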

The differentiability of matrix functions defined from one-variable functions is discussed
frequently in the literature (see [2], [4], [6]). The most comprehensive result is
by Brown and Vasudeva in [3], who prove that $m$-times continuously differentiable
real functions induce $m$-times continuously Fréchet differentiable matrix functions.

If a matrix function is defined using a real-valued function on $\mathbb{R}^d$ as in (1.1), its
domain is the space of $d$-tuples of pairwise-commuting $n \times n$ self-adjoint matrices,
denoted $CS_n^d$. For $d > 1$, the space of $d$-tuples of $n \times n$ self-adjoint matrices is denoted
$S_n^d$ and for $d = 1$, it is denoted $S_n$.

In Section 2, we analyze the geometry of $CS_n^d$ and conclude that the best notion of
differentiability for functions on this space is differentiation along curves. If we fix
$S$ in $CS_n^d$, Theorem 2.3 characterizes the directions $\Delta$ in $S_n^d$ such that there is a $C^1$
curve $S(t)$ in $CS_n^d$ with $S(0) = S$ and $S'(0) = \Delta$. In Theorem 2.5, we show that the
joint eigenvalues of Lipschitz curves in $CS_n^d$ can be represented by Lipschitz functions.

In Section 3, we examine the differentiability properties of induced matrix functions.
Specifically, in Theorem 3.1, we show that a $C^1$ function induces a matrix function
that can be differentiated along $C^1$ curves. We then calculate a formula
for the derivative along curves and, in Theorem 3.6, prove that it is continuous.

In Section 4, we consider higher-order differentiation. With additional domain
restrictions, in Theorem 4.1, we show that an induced matrix function is $m$-times
continuously differentiable along $C^m$ curves. We also calculate a formula for the
derivatives and, in Theorem 4.5, show they are continuous. In Section 5, we discuss
several applications of the differentiability results.

There is an alternate approach for inducing a matrix function from a multivariate
function; the $d$ matrices $S^1, \dots, S^d$ are viewed as operators on Hilbert spaces
$H^1, \dots, H^d$ and $F(S)$ is viewed as an operator on $H^1 \otimes \cdots \otimes H^d$. Brown and
Vasudeva generalize their one-variable result to these matrix functions in [3].

Before proceeding, I would like to thank John McCarthy for his guidance during this 
research and the referees for their many useful suggestions. 

2 The Geometry of $CS_n^d$

Let $S = (S^1, \dots, S^d)$ be in $CS_n^d$ (or $S_n^d$) and let $x_i = (x_i^1, \dots, x_i^d)$ be in $\sigma(S)$. Define

\[ \|S\| := \max_{1 \le r \le d} \|S^r\| \quad \text{and} \quad \|x_i\| := \max_{1 \le r \le d} |x_i^r|, \]

where $\|S^r\|$ is the usual operator norm. Observe that $CS_n^d$ is not a linear space; if $A$
and $B$ are pairwise-commuting $d$-tuples, the sum $A + B$ need not pairwise commute.
Thus, neither the Fréchet nor the Gâteaux derivative can be defined for functions on
$CS_n^d$ because both require the function to be defined on linear sets around each point.
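
A concrete instance of this failure of linearity is easy to produce. In the sketch below, the matrices `P`, `X`, `Z` and the helper names are assumptions chosen for illustration: two commuting pairs of $2 \times 2$ self-adjoint matrices are added componentwise, and the sum is no longer a commuting pair.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(row_ab, row_ba)] for row_ab, row_ba in zip(AB, BA)]

def pairwise_commute(matrices, tol=1e-12):
    """True when every pair of matrices in the tuple commutes up to tol."""
    for r in range(len(matrices)):
        for s in range(r + 1, len(matrices)):
            C = commutator(matrices[r], matrices[s])
            if max(abs(entry) for row in C for entry in row) > tol:
                return False
    return True

P = [[1.0, 0.0], [0.0, 0.0]]   # a projection
X = [[0.0, 1.0], [1.0, 0.0]]   # a symmetric matrix that does not commute with P
Z = [[0.0, 0.0], [0.0, 0.0]]   # the zero matrix

A = (P, Z)                     # commuting pair
B = (Z, X)                     # commuting pair
A_plus_B = tuple(add(a, b) for a, b in zip(A, B))   # equals (P, X)
```

Both `A` and `B` lie in $CS_2^2$, while `A_plus_B` does not, since $[P, X] \neq 0$.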

Recall that $CS_n^d$ is the set of elements $S \in S_n^d$ with $[S^r, S^s] = 0$ for all $1 \le r, s \le d$.
Thus, $CS_n^d$ is the zero set of the polynomials associated with $\frac{d(d-1)}{2}$ commutator
operations and so is an algebraic variety. A result by Whitney [10] says every algebraic
variety can be decomposed into submanifolds that fit together 'regularly' and whose
tangent spaces fit together 'regularly'. For a manifold $N$, let $TN$ denote the tangent
space of $N$ and let $T_S N$ denote the tangent space based at a point $S$ in $N$. To make
Whitney's conditions more precise, we need the following definition:

Definition 2.1 A stratification of $X$ is a locally finite partition $Z$ of $X$ such that

(i) Each piece $M_\alpha \in Z$ is a smooth submanifold of $X$.

(ii) The frontier of each piece, $\overline{M_\alpha} \setminus M_\alpha$, is either trivial or a union of other pieces.

Then $X$ is called a stratified space with stratification $Z$.

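Example 2.2 Consider $CS_2^2$, the space of pairs of self-adjoint, commuting $2 \times 2$
matrices. In the following definitions, $a, b, c, d \in \mathbb{R}$. Define

\[
\begin{aligned}
M_1 &:= \left\{ \left( U \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} U^*,\ U \begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix} U^* \right) : U \in S_2 \text{ is unitary},\ a \neq b,\ c \neq d \right\}, \\
M_2 &:= \left\{ \left( U \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} U^*,\ U \begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix} U^* \right) : U \in S_2 \text{ is unitary},\ c \neq d \right\}, \\
M_3 &:= \left\{ \left( U \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} U^*,\ U \begin{pmatrix} c & 0 \\ 0 & c \end{pmatrix} U^* \right) : U \in S_2 \text{ is unitary},\ a \neq b \right\}, \\
M_4 &:= \left\{ \left( \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix},\ \begin{pmatrix} c & 0 \\ 0 & c \end{pmatrix} \right) \right\}.
\end{aligned}
\]

It is clear that $CS_2^2 = \cup M_i$. Moreover, each $M_i$ is a manifold and $\overline{M_i} \setminus M_i$ is either
trivial or a union of other $M_j$. Thus, the partition $\{M_i\}$ is a stratification of $CS_2^2$.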

In general, a decomposition of $CS_n^d$ into pieces will be related to the number and
multiplicity of the repeated joint eigenvalues of the elements of $CS_n^d$.

Whitney's result says $CS_n^d$ has a stratification $Z$ with further regularity. Specifically,
let $\{M_\alpha\}$ denote the pieces of $Z$ and define $TCS_n^d := \cup\, TM_\alpha$. Then, $TCS_n^d$ is also
a stratified space, and we call $Z$ a Whitney stratification of $CS_n^d$. Given a function
$F : CS_n^d \to S_n$, one type of derivative is a map

\[ DF : TCS_n^d \to TS_n \quad \text{such that} \quad DF|_{TM_\alpha} : TM_\alpha \to TS_n \]

is the usual differential map for each $M_\alpha$. In Theorem 3.8, we analyze such maps.
However, these differential maps cannot be easily generalized to analyze higher-order
differentiation. Furthermore, the space $TCS_n^d$ will only contain a subset of the vectors
tangent to $CS_n^d$. Example 2.4 will show that strict containment often occurs.

To retain information about all tangent vectors, we will mostly study differentiation
along differentiable curves. We first determine which $\Delta$ in $S_n^d$ are vectors tangent to
$CS_n^d$ at a given point $S$. This is equivalent to the following question:

Is there a $C^1$ curve $S(t)$ in $CS_n^d$ with $S(0) = S$ and $S'(0) = \Delta$?

For an element $S \in CS_n^d$ with distinct joint eigenvalues, Agler, McCarthy, and Young
in [1] gave necessary and sufficient conditions on $S$ and $\Delta$ for such a $C^1$ curve to
exist. We extend their result to an arbitrary element $S$. Fix $S \in CS_n^d$ and $\Delta \in S_n^d$.
Let $U$ be a unitary matrix diagonalizing each component of $S$ such that the repeated
joint eigenvalues appear consecutively. Renumbering the $x_i$'s if necessary, define



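\[ D^r := U^* S^r U = \begin{pmatrix} x_1^r & & \\ & \ddots & \\ & & x_n^r \end{pmatrix} \qquad \forall\, 1 \le r \le d. \tag{2.2} \]

For each $r$, define the two matrices

\[ \Gamma^r := U^* \Delta^r U \quad \text{and} \quad \tilde{\Gamma}^r_{ij} := \begin{cases} \Gamma^r_{ij} & \text{if } x_i = x_j \\ 0 & \text{otherwise.} \end{cases} \tag{2.3} \]

Then $\tilde{\Gamma}^r$ is a block diagonal matrix. Each block corresponds to a distinct joint
eigenvalue of $S$ and has dimension equal to the multiplicity of that eigenvalue.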

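Theorem 2.3 Let $S \in CS_n^d$ and $\Delta \in S_n^d$. There exists a $C^1$ curve $S(t)$ in $CS_n^d$ with
$S(0) = S$ and $S'(0) = \Delta$ iff

\[ [D^r, \Gamma^s] = [D^s, \Gamma^r] \quad \text{and} \quad [\tilde{\Gamma}^r, \tilde{\Gamma}^s] = 0 \qquad \forall\, 1 \le r, s \le d. \]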

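Proof: ($\Rightarrow$) Assume $S(t)$ is a $C^1$ curve in $CS_n^d$ with $S(0) = S$ and $S'(0) = \Delta$. Define

\[ R(t) := U^* S(t) U, \]

where $U$ diagonalizes $S$ as in (2.2). Then $R(t)$ is a $C^1$ curve in $CS_n^d$ with $R(0) = D$
and $R'(0) = \Gamma$. We will first prove that

\[ [D^r, \Gamma^s] = [D^s, \Gamma^r] \quad \text{and} \quad [\Gamma^r, \Gamma^s]_{ij} = 0 \qquad \forall\, 1 \le r, s \le d \text{ and } (i,j) \text{ such that } x_i = x_j. \]

We will then use those commutativity results to conclude

\[ [\tilde{\Gamma}^r, \tilde{\Gamma}^s] = 0 \qquad \forall\, 1 \le r, s \le d. \]

Since $R(t)$ is $C^1$ in a neighborhood of $t = 0$, we can write

\[ R^r(t) = D^r + \Gamma^r t + h^r(t) \qquad \forall\, 1 \le r \le d, \]

where $|h^r(t)_{ij}| = o(|t|)$ for $1 \le i, j \le n$. For each pair $r$ and $s$, the pairwise
commutativity of $R(t)$ implies

\[
\begin{aligned}
0 &= [R^r(t), R^s(t)] \\
&= [D^r + \Gamma^r t + h^r(t),\ D^s + \Gamma^s t + h^s(t)] \\
&= \big( [D^r, h^s(t)] + [h^r(t), D^s] + [h^r(t), h^s(t)] \big) \\
&\quad + \big( [D^r, \Gamma^s] + [\Gamma^r, D^s] + [\Gamma^r, h^s(t)] + [h^r(t), \Gamma^s] \big)\, t \\
&\quad + [\Gamma^r, \Gamma^s]\, t^2,
\end{aligned}
\tag{2.4}
\]

where the term $[D^r, D^s]$ was omitted because it vanishes. Fix $t \neq 0$ and divide each
term in (2.4) by $t$. Letting $t$ tend towards zero yields

\[ 0 = [D^r, \Gamma^s] - [D^s, \Gamma^r]. \tag{2.5} \]

Choose $i$ and $j$ such that $x_i = x_j$. Since the $ij$-th entry of any commutator with $D^r$ or $D^s$
vanishes for such $i$ and $j$, the $ij$-th entry of (2.4) reduces to

\[ 0 = [h^r(t), h^s(t)]_{ij} + \big( [\Gamma^r, h^s(t)]_{ij} - [\Gamma^s, h^r(t)]_{ij} \big)\, t + [\Gamma^r, \Gamma^s]_{ij}\, t^2. \]

Fix $t \neq 0$ and divide both sides by $t^2$. Letting $t$ tend towards zero yields

\[ 0 = [\Gamma^r, \Gamma^s]_{ij}. \tag{2.6} \]

Fix $r$ and $s$ with $1 \le r, s \le d$. Since $\tilde{\Gamma}^r$ and $\tilde{\Gamma}^s$ are block diagonal matrices with
blocks corresponding to the distinct joint eigenvalues of $S$, it follows that $\tilde{\Gamma}^r \tilde{\Gamma}^s$ and
$\tilde{\Gamma}^s \tilde{\Gamma}^r$ are also such block diagonal matrices. Thus, if $i$ and $j$ are such that $x_i \neq x_j$,

\[ [\tilde{\Gamma}^r, \tilde{\Gamma}^s]_{ij} = \big( \tilde{\Gamma}^r \tilde{\Gamma}^s - \tilde{\Gamma}^s \tilde{\Gamma}^r \big)_{ij} = 0. \]

Now, fix $i$ and $j$ such that $x_i = x_j$. By the definition of $\tilde{\Gamma}$,

\[
\begin{aligned}
[\tilde{\Gamma}^r, \tilde{\Gamma}^s]_{ij} &= \sum_{k=1}^{n} \big( \tilde{\Gamma}^r_{ik} \tilde{\Gamma}^s_{kj} - \tilde{\Gamma}^s_{ik} \tilde{\Gamma}^r_{kj} \big) \\
&= \sum_{\{k : x_k = x_i\}} \big( \Gamma^r_{ik} \Gamma^s_{kj} - \Gamma^s_{ik} \Gamma^r_{kj} \big) \\
&= - \sum_{\{k : x_k \neq x_i\}} \big( \Gamma^r_{ik} \Gamma^s_{kj} - \Gamma^s_{ik} \Gamma^r_{kj} \big),
\end{aligned}
\]

where the last equality uses (2.6). Thus, it suffices to show that if $x_k \neq x_i$,

\[ \Gamma^r_{ik} \Gamma^s_{kj} - \Gamma^s_{ik} \Gamma^r_{kj} = 0. \]

Assume $x_k \neq x_i$, and fix $q$ with $x_k^q \neq x_i^q$. Apply (2.5) to pairs $r, q$ and $s, q$ to get

\[ [D^q, \Gamma^r] = [D^r, \Gamma^q] \quad \text{and} \quad [D^q, \Gamma^s] = [D^s, \Gamma^q]. \]

Restricting to the $ik$-th and $kj$-th entries of the previous two equations yields

\[
\begin{aligned}
\Gamma^r_{ik} (x_i^q - x_k^q) &= \Gamma^q_{ik} (x_i^r - x_k^r), & \Gamma^r_{kj} (x_k^q - x_j^q) &= \Gamma^q_{kj} (x_k^r - x_j^r), \\
\Gamma^s_{ik} (x_i^q - x_k^q) &= \Gamma^q_{ik} (x_i^s - x_k^s), & \Gamma^s_{kj} (x_k^q - x_j^q) &= \Gamma^q_{kj} (x_k^s - x_j^s).
\end{aligned}
\tag{2.7}
\]

Since $x_i = x_j$ and $x_k^q \neq x_i^q$, we can replace all the $x_j$'s with $x_i$'s in (2.7) and solve for
the $\Gamma^r$ and $\Gamma^s$ entries. Using these relations gives

\[ \Gamma^r_{ik} \Gamma^s_{kj} - \Gamma^s_{ik} \Gamma^r_{kj} = \frac{\Gamma^q_{ik} (x_i^r - x_k^r)\, \Gamma^q_{kj} (x_i^s - x_k^s)}{(x_i^q - x_k^q)^2} - \frac{\Gamma^q_{ik} (x_i^s - x_k^s)\, \Gamma^q_{kj} (x_i^r - x_k^r)}{(x_i^q - x_k^q)^2} = 0, \]

as desired. Thus, $[\tilde{\Gamma}^r, \tilde{\Gamma}^s] = 0$.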

($\Leftarrow$) Fix $S \in CS_n^d$ and $\Delta \in S_n^d$ and let $U$, $D$, and $\Gamma$ be as in the discussion preceding
Theorem 2.3. Assume

\[ [D^r, \Gamma^s] = [D^s, \Gamma^r] \quad \text{and} \quad [\tilde{\Gamma}^r, \tilde{\Gamma}^s] = 0 \qquad \forall\, 1 \le r, s \le d. \tag{2.8} \]

Define a skew-Hermitian matrix $Y$ as follows:

\[ Y_{ij} := \begin{cases} \dfrac{\Gamma^q_{ij}}{x_j^q - x_i^q} & \text{if } x_i \neq x_j \\[2mm] 0 & \text{otherwise,} \end{cases} \]

where $q$ is chosen so that $x_i^q - x_j^q \neq 0$. Observe that $Y$ is independent of $q$ because
the $ij$-th entry of the first equation in (2.8) is

\[ \Gamma^s_{ij} (x_i^r - x_j^r) = \Gamma^r_{ij} (x_i^s - x_j^s). \]

Now, define the curve $S(t)$ by

\[ S^r(t) := U e^{Yt} \big[ D^r + t \tilde{\Gamma}^r \big] e^{-Yt} U^* \qquad \forall\, 1 \le r \le d. \]

Then, $S(t)$ is continuously differentiable. Because $Y$ is skew-Hermitian, $e^{Yt}$ is unitary.
Since $D^r$ and $\tilde{\Gamma}^r$ are self-adjoint, $S(t) \in S_n^d$. By a simple calculation using (2.8),

\[ [S^r(t), S^s(t)] = 0 \qquad \forall\, 1 \le r, s \le d. \]

Thus, $S(t) \in CS_n^d$. By definition, $S(0) = S$. For each $r$,

\[ (S^r)'(t) = U \Big( Y e^{Yt} \big[ D^r + t \tilde{\Gamma}^r \big] e^{-Yt} + e^{Yt} \big[ \tilde{\Gamma}^r \big] e^{-Yt} - e^{Yt} \big[ D^r + t \tilde{\Gamma}^r \big] Y e^{-Yt} \Big) U^*, \]

so that

\[ (S^r)'(0) = U \big( [Y, D^r] + \tilde{\Gamma}^r \big) U^* = \Delta^r. \]

Thus, $S'(0) = \Delta$, and $S(t)$ is the desired curve. □
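
The curve constructed in the converse direction can be tested on a small case. The sketch below assumes $U = I$, $d = 2$, $n = 2$, joint eigenvalues $x_1 = (0,0)$ and $x_2 = (1,2)$, and directions $\Gamma^1, \Gamma^2$ chosen by hand to satisfy $[D^1, \Gamma^2] = [D^2, \Gamma^1]$; all numerical values and function names are illustrative assumptions. With these data $Y = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, so $e^{Yt}$ is a rotation and no general matrix exponential is needed.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def rotation(t):
    """exp(Y t) for the skew-symmetric Y = [[0, 1], [-1, 0]]."""
    return [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

# Hypothetical data with U = I: joint eigenvalues x_1 = (0, 0), x_2 = (1, 2).
D = [[[0.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 2.0]]]
# Directions Delta^r = Gamma^r satisfying [D^1, Gamma^2] = [D^2, Gamma^1].
Gamma = [[[0.5, 1.0], [1.0, -0.3]], [[0.2, 2.0], [2.0, 0.7]]]
# Gamma-tilde keeps only entries with x_i = x_j, here the diagonal.
Gamma_tilde = [[[G[i][j] if i == j else 0.0 for j in range(2)] for i in range(2)] for G in Gamma]
# Y_12 = Gamma^1_12 / (x_2^1 - x_1^1) = 1, so Y = [[0, 1], [-1, 0]].

def curve(r, t):
    """S^r(t) = exp(Yt) (D^r + t Gamma_tilde^r) exp(-Yt)."""
    middle = [[D[r][i][j] + t * Gamma_tilde[r][i][j] for j in range(2)] for i in range(2)]
    return matmul(matmul(rotation(t), middle), rotation(-t))

def commutator_size(t):
    """Largest entry of [S^1(t), S^2(t)]."""
    A, B = curve(0, t), curve(1, t)
    AB, BA = matmul(A, B), matmul(B, A)
    return max(abs(AB[i][j] - BA[i][j]) for i in range(2) for j in range(2))

def derivative_gap(h=1e-4):
    """Largest entry of the central difference of S^r at 0 minus Gamma^r."""
    gap = 0.0
    for r in range(2):
        plus, minus = curve(r, h), curve(r, -h)
        for i in range(2):
            for j in range(2):
                gap = max(gap, abs((plus[i][j] - minus[i][j]) / (2 * h) - Gamma[r][i][j]))
    return gap
```

`commutator_size(t)` should vanish up to rounding for every `t`, and `derivative_gap()` should be small, reflecting $S'(0) = \Delta$ up to finite-difference error.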

Example 2.4 Let $I \in CS_n^d$ be the identity element. By Theorem 2.3, there is a
continuously differentiable curve $S(t)$ in $CS_n^d$ with

\[ S(0) = I \ \text{ and } \ S'(0) = \Delta \quad \text{if and only if} \quad \Delta \in CS_n^d. \]

Thus, the set of vectors tangent to $CS_n^d$ at $I$ is $CS_n^d$. For a Whitney stratification of
$CS_n^d$ and piece $M_\alpha$ containing $I$, the tangent space $T_I M_\alpha$ is linear. Since $CS_n^d$ is not
linear, $T_I M_\alpha$ is a strict subset of the set of tangent vectors at $I$.
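
The two commutator conditions of Theorem 2.3 are straightforward to test numerically when $S$ is already diagonal. The following sketch assumes $U = I$, real symmetric directions, and exact equality tests on the joint eigenvalues; the function `is_tangent` and the sample matrices are hypothetical names introduced for illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def max_abs(A):
    return max(abs(entry) for row in A for entry in row)

def is_tangent(joint_eigs, gammas, tol=1e-12):
    """Check both commutator conditions of Theorem 2.3 for diagonal S (U = I)."""
    n, d = len(joint_eigs), len(gammas)
    D = [[[joint_eigs[i][r] if i == j else 0.0 for j in range(n)] for i in range(n)]
         for r in range(d)]
    tilde = [[[G[i][j] if joint_eigs[i] == joint_eigs[j] else 0.0 for j in range(n)]
              for i in range(n)] for G in gammas]
    for r in range(d):
        for s in range(r + 1, d):
            lhs, rhs = commutator(D[r], gammas[s]), commutator(D[s], gammas[r])
            diff = [[lhs[i][j] - rhs[i][j] for j in range(n)] for i in range(n)]
            if max_abs(diff) > tol or max_abs(commutator(tilde[r], tilde[s])) > tol:
                return False
    return True

identity_eigs = [(1.0, 1.0), (1.0, 1.0)]   # S = (I, I), as in Example 2.4
P = [[1.0, 0.0], [0.0, 0.0]]
X = [[0.0, 1.0], [1.0, 0.0]]
Q = [[2.0, 0.0], [0.0, 3.0]]

distinct_eigs = [(0.0, 0.0), (1.0, 2.0)]
G1 = [[0.5, 1.0], [1.0, -0.3]]
G2 = [[0.2, 2.0], [2.0, 0.7]]
```

At the identity pair, `is_tangent(identity_eigs, (P, Q))` holds because $P$ and $Q$ commute, while `(P, X)` fails, matching the example. For distinct joint eigenvalues only the first condition is restrictive: `(G1, G2)` passes and `(G1, X)` does not.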

The conditions of Theorem 2.3 actually imply that if $S$ in $CS_n^d$ has any repeated
joint eigenvalues, the set of vectors tangent to $CS_n^d$ at $S$ is not a linear set. Then,
for any Whitney stratification of $CS_n^d$ and piece $M_\alpha$ containing $S$, the tangent space
$T_S M_\alpha$ is a strict subset of the vectors tangent to $CS_n^d$ at $S$. We will thus focus on
differentiation along curves rather than differential maps.

To evaluate an induced matrix function along a curve in $CS_n^d$, we apply the original
function to the curve's joint eigenvalues. We are therefore interested in the behavior of
the joint eigenvalues of curves in $CS_n^d$.

If $S(t)$ is a continuous curve in $S_n$, a result by Rellich in [8] and [9] states that the
eigenvalues of $S(t)$ can be represented by $n$ continuous functions. A succinct proof is
given by Kato in [7, pg 107-10]. With slight modification, the arguments show that
the eigenvalues of a Lipschitz curve in $S_n$ can be represented by Lipschitz functions.
These results generalize as follows:

Theorem 2.5 Given a Lipschitz curve $S(t)$ in $CS_n^d$ defined on an interval $I$, there
exist Lipschitz functions $x_1(t), \dots, x_n(t) : I \to \mathbb{R}^d$ with $\sigma(S(t)) = \{x_i(t) : 1 \le i \le n\}$.

Proof: As the proof is a technical but straightforward modification of the
one-variable case, it is left as an exercise. □

Theorem 2.5 provides a specific ordering of the eigenvalues of $S(t)$ at each $t$. This
ordering may differ from the one in (2.2), where joint eigenvalues appear consecutively.
However, Theorem 2.5 implies that the eigenvalues of a Lipschitz curve $S(t)$
are Lipschitz as an unordered $n$-tuple. Specifically, fix $t^*$ and denote the eigenvalues
of $S(t^*)$ by $\{x_i : 1 \le i \le n\}$. Then, for $t$ near $t^*$, there is a constant $c$ such that

\[ \min \Big( \max_{1 \le i \le n} \| x_i - x_i(t) \| \Big) \le c\, |t^* - t|, \]

where the minimum is taken over all reorderings of the $\{x_i\}$. If we require that
eigenvalues are ordered as in (2.2), we will use Theorem 2.5 to conclude that the
eigenvalues are Lipschitz as an unordered $n$-tuple.



3 Differentiating Matrix Functions 

Recall that every real-valued function defined on an open set $\Omega \subseteq \mathbb{R}^d$ induces a
matrix function as in (1.1). We denote its domain, the space of $d$-tuples of pairwise-commuting
$n \times n$ self-adjoint matrices with joint spectrum in $\Omega$, by $CS_n^d(\Omega)$.

If the original function is continuous, the matrix function is as well. Specifically,
Horn and Johnson proved in [6, pg 387-9] that a one-variable polynomial induces
a continuous matrix polynomial. The arguments generalize easily to multivariate
polynomials, and approximation arguments imply that the matrix function of a
continuous function is continuous. We now consider differentiability and prove:

Theorem 3.1 Let $S(t)$ be a $C^1$ curve in $CS_n^d$ defined on an interval $I$ and let $\Omega$ be
an open set in $\mathbb{R}^d$ with $\sigma(S(t)) \subset \Omega$. If $f \in C^1(\Omega, \mathbb{R})$, then

(i) $\frac{d}{dt} F(S(t))\big|_{t=t^*}$ exists for all $t^* \in I$.

(ii) If $T(t)$ is another $C^1$ curve in $CS_n^d$ with $T(0) = S(t^*)$ and $T'(0) = S'(t^*)$, then

\[ \frac{d}{dt} F(T(t))\Big|_{t=0} = \frac{d}{dt} F(S(t))\Big|_{t=t^*}. \]

We say an open set $\Omega \subseteq \mathbb{R}^d$ is a rectangle if $\Omega = I^1 \times \cdots \times I^d$, and an open set
$\tilde{\Omega} \subseteq \mathbb{C}^d$ is a complex rectangle if $\tilde{\Omega} = (I^1 + iJ^1) \times \cdots \times (I^d + iJ^d)$, where each
$I^r$ and $J^r$ is an open interval in $\mathbb{R}$. Before proving Theorem 3.1, we assume $f$ is
real-analytic and prove Proposition 3.2. See [6] for the one-variable case.

Proposition 3.2 Let $S(t)$ be a $C^1$ curve in $CS_n^d$ defined on an interval $I$. Let $\Omega$ be
an open rectangle in $\mathbb{R}^d$ with $\sigma(S(t)) \subset \Omega$. If $f$ is a real-analytic function on $\Omega$, then

\[ \frac{d}{dt} F(S(t))\Big|_{t=t^*} \ \text{exists and is continuous as a function of } t^* \text{ on } I. \]

The proof of Proposition 3.2 requires the following two lemmas.

Lemma 3.3 Let $\Omega$ be an open rectangle in $\mathbb{R}^d$ and let $S \in CS_n^d$ with $\sigma(S) \subset \Omega$.
Each real-analytic function on $\Omega$ can be extended to an analytic function defined on
a complex rectangle $\tilde{\Omega}$ such that $\sigma(S)$ is in $\tilde{\Omega}$.

Proof: The result follows from basic properties of complex functions. It should be
noted that $\tilde{\Omega}$ need not contain $\Omega$. □

Lemma 3.4 Let $\tilde{\Omega}$ be an open rectangle in $\mathbb{C}^d$ and let $S \in CS_n^d$ with $\sigma(S) \subset \tilde{\Omega}$. If $f$
is an analytic function on $\tilde{\Omega}$, then

\[ F(S) = \frac{1}{(2\pi i)^d} \int_{C^d} \cdots \int_{C^1} f(\zeta^1, \dots, \zeta^d)\, (\zeta^1 I - S^1)^{-1} \cdots (\zeta^d I - S^d)^{-1} \, d\zeta^1 \cdots d\zeta^d, \]

where $C^r$ is a rectifiable curve strictly containing $\sigma(S^r)$, and $C^1 \times \cdots \times C^d \subset \tilde{\Omega}$.

Proof: Horn and Johnson prove the formula for a one-variable function in [6, pg
427]. Their derivation generalizes easily to multivariate functions. □
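
The integral formula of Lemma 3.4 can be approximated numerically for $d = 2$ by applying the trapezoid rule on circles. The sketch below is only an illustration under stated assumptions: the contours are taken to be circles $|\zeta| = 4$ (so $d\zeta = i\zeta\, d\theta$), the commuting pair and the entire function $f(\zeta^1, \zeta^2) = e^{\zeta^1} \zeta^2$ are hypothetical choices, and the helper names are not from the paper.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def resolvent(zeta, S):
    """(zeta I - S)^{-1} for a 2 x 2 matrix S."""
    a, b = zeta - S[0][0], -S[0][1]
    c, d = -S[1][0], zeta - S[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def conjugate(U, values):
    """U diag(values) U^T for a real orthogonal U."""
    Ut = [list(col) for col in zip(*U)]
    D = [[values[0], 0.0], [0.0, values[1]]]
    return matmul(matmul(U, D), Ut)

# Hypothetical commuting pair with joint eigenvalues (1, 0.5) and (2, -1).
theta = 0.4
U = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]
joint_eigs = [(1.0, 0.5), (2.0, -1.0)]
S1 = conjugate(U, [x[0] for x in joint_eigs])
S2 = conjugate(U, [x[1] for x in joint_eigs])

def f(z1, z2):
    return cmath.exp(z1) * z2

def contour_F(radius=4.0, nodes=64):
    """Trapezoid rule on circles |zeta| = radius; d zeta = i zeta d theta."""
    zetas = [radius * cmath.exp(2j * math.pi * k / nodes) for k in range(nodes)]
    R1 = [resolvent(z, S1) for z in zetas]
    R2 = [resolvent(z, S2) for z in zetas]
    total = [[0j, 0j], [0j, 0j]]
    for z1, A in zip(zetas, R1):
        for z2, B in zip(zetas, R2):
            # (1 / (2 pi i))^2 times (i z1 dtheta)(i z2 dtheta) = z1 z2 / nodes^2
            weight = f(z1, z2) * z1 * z2 / nodes**2
            AB = matmul(A, B)
            for i in range(2):
                for j in range(2):
                    total[i][j] += weight * AB[i][j]
    return total

exact = conjugate(U, [f(*x).real for x in joint_eigs])   # definition (1.1)
approx = contour_F()
gap = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

Because the integrand is analytic in an annulus around each circle, the trapezoid rule converges geometrically, so a modest number of nodes already reproduces the spectral definition closely.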

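Proof of Proposition 3.2:

For ease of notation, assume $d = 2$ and define

\[ R^r(t) := (\zeta^r I - S^r(t))^{-1} \qquad \forall\, 1 \le r \le 2, \]

where $\zeta^r$ is in the resolvent set of $S^r(t)$. Fix $t_0 \in I$ and extend $f$ to an analytic function
on a complex open rectangle $\tilde{\Omega}$ containing $\sigma(S(t_0))$. Choose rectifiable curves $C^1$ and
$C^2$ such that $C^1 \times C^2 \subset \tilde{\Omega}$ and each $C^r$ strictly encloses the eigenvalues of $S^r(t_0)$.
By Theorem 2.5, the joint eigenvalues of $S(t)$ are continuous and by Lemma 3.4,

\[ F(S(t)) = \frac{1}{(2\pi i)^2} \int_{C^2} \int_{C^1} f(\zeta^1, \zeta^2)\, R^1(t)\, R^2(t) \, d\zeta^1 d\zeta^2, \]

for $t$ sufficiently close to $t_0$. Direct calculation gives

\[ \frac{d}{dt} R^r(t)\Big|_{t=t^*} = R^r(t^*)\, (S^r)'(t^*)\, R^r(t^*) \qquad \text{for } 1 \le r \le 2 \text{ and } t^* \text{ near } t_0. \]

It can be easily shown that, for $t^*$ sufficiently close to $t_0$, we can interchange integration
and differentiation to yield

\[
\begin{aligned}
\frac{d}{dt} F(S(t))\Big|_{t=t^*} &= \frac{1}{(2\pi i)^2} \int_{C^2} \int_{C^1} f(\zeta^1, \zeta^2)\, \frac{d}{dt} \big( R^1(t) R^2(t) \big)\Big|_{t=t^*} \, d\zeta^1 d\zeta^2 \\
&= \frac{1}{(2\pi i)^2} \int_{C^2} \int_{C^1} f(\zeta^1, \zeta^2) \Big( R^1(t^*)\, (S^1)'(t^*)\, R^1(t^*) R^2(t^*) \\
&\qquad\qquad + R^1(t^*) R^2(t^*)\, (S^2)'(t^*)\, R^2(t^*) \Big) \, d\zeta^1 d\zeta^2.
\end{aligned}
\tag{3.9}
\]

As each $(S^r)'(t)$ is continuous and all other terms in (3.9) are uniformly bounded
near $t_0$, we get that $\frac{d}{dt} F(S(t))\big|_{t=t^*}$ is continuous at $t^* = t_0$. □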

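Proof of Theorem 3.1:

Observe that the theorem holds for polynomials: (i) follows from Proposition 3.2 and
(ii) follows from the formula in (3.9). Fix $t^* \in I$. Let $f$ be an arbitrary $C^1$ function
and let $p$ be a polynomial that agrees with $f$ to first order on $\sigma(S(t^*))$.

By Theorem 2.5, there are Lipschitz maps $x_i(t) := (x_i^1(t), \dots, x_i^d(t))$, for $1 \le i \le n$,
representing $\sigma(S(t))$ on $I$. From the multivariate Mean Value Theorem, we have

\[
\begin{aligned}
\| (F - P)(S(t)) \| &= \max_i \big| (f - p)(x_i(t)) \big| \\
&= \max_i \big| (f - p)(x_i(t)) - (f - p)(x_i(t^*)) \big| \\
&= \max_i \big| \nabla (f - p)(x_i^*(t)) \cdot (x_i(t) - x_i(t^*)) \big| \\
&\le \max_i \sum_{r=1}^{d} \Big| \Big( \frac{\partial f}{\partial x^r} - \frac{\partial p}{\partial x^r} \Big)(x_i^*(t)) \Big| \, \big| x_i^r(t) - x_i^r(t^*) \big|,
\end{aligned}
\tag{3.10}
\]

where $x_i^*(t)$ is on the line connecting $x_i(t)$ and $x_i(t^*)$ in $\mathbb{R}^d$. For $t$ near $t^*$, continuity
implies $x_i^*(t) \in \Omega$. As $f$ and $p$ agree to first order on $\sigma(S(t^*))$ and each $x_i(t)$ is
Lipschitz, from (3.10), we have

\[ \| (F - P)(S(t)) \| = o(|t - t^*|). \]

Hence

\[ \frac{F(S(t)) - F(S(t^*))}{t - t^*} - \frac{P(S(t)) - P(S(t^*))}{t - t^*} \to 0 \quad \text{as } t \to t^*. \]

Therefore,

\[ \frac{d}{dt} F(S(t))\Big|_{t=t^*} \ \text{exists and equals} \ \frac{d}{dt} P(S(t))\Big|_{t=t^*}. \]

Applying the same argument to $F(T(t))$ at $t = 0$ gives

\[ \frac{d}{dt} F(T(t))\Big|_{t=0} \ \text{exists and equals} \ \frac{d}{dt} P(T(t))\Big|_{t=0}. \]

As (ii) holds for $P$, we must have

\[ \frac{d}{dt} F(T(t))\Big|_{t=0} = \frac{d}{dt} F(S(t))\Big|_{t=t^*}. \qquad \Box \]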

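In the following proposition, we calculate an explicit formula for the derivative.

Proposition 3.5 Let $S(t)$ be a $C^1$ curve in $CS_n^d$ defined on an interval $I$ and let
$t^* \in I$. Let $\Omega$ be an open set in $\mathbb{R}^d$ with $\sigma(S(t)) \subset \Omega$ and let $f \in C^1(\Omega, \mathbb{R})$. Then,

\[ \frac{d}{dt} F(S(t))\Big|_{t=t^*} = U \Big( \sum_{r=1}^{d} \tilde{\Gamma}^r \frac{\partial F}{\partial x^r}(D) + [Y, F(D)] \Big) U^*, \]

where $U$ diagonalizes $S(t^*)$ as in (2.2) and the other matrices are as follows:

\[ D^r := U^* \big[ S^r(t^*) \big] U, \qquad \Gamma^r := U^* \big[ (S^r)'(t^*) \big] U, \]

\[ \tilde{\Gamma}^r_{ij} := \begin{cases} \Gamma^r_{ij} & \text{if } x_i = x_j \\ 0 & \text{otherwise,} \end{cases} \qquad Y_{ij} := \begin{cases} \dfrac{\Gamma^q_{ij}}{x_j^q - x_i^q} & \text{if } x_i \neq x_j \\[2mm] 0 & \text{otherwise,} \end{cases} \]

where the joint eigenvalues of $S(t^*)$ are given by $\{x_i = (x_i^1, \dots, x_i^d) : 1 \le i \le n\}$ and
$q$ is chosen so $x_j^q - x_i^q \neq 0$.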

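Proof: Let $t^* \in I$ and define the $C^1$ curve $T(t)$ by

\[ T^r(t) := U e^{Yt} \big[ D^r + t \tilde{\Gamma}^r \big] e^{-Yt} U^* \qquad \forall\, 1 \le r \le d. \]

Then, $T(t)$ is the curve defined in the proof of Theorem 2.3 for $S := S(t^*)$ and
$\Delta := S'(t^*)$. It is immediate that $T(t) \in CS_n^d$, $T(0) = S(t^*)$, and $T'(0) = S'(t^*)$. By
Theorem 3.1, it now suffices to calculate $\frac{d}{dt} F(T(t))\big|_{t=0}$. First, we diagonalize each
$D^r + t \tilde{\Gamma}^r$. Let $p$ be the number of distinct joint eigenvalues of $S(t^*)$. By definition,

\[ \tilde{\Gamma}^r = \begin{pmatrix} \Gamma_1^r & & \\ & \ddots & \\ & & \Gamma_p^r \end{pmatrix} \qquad \forall\, 1 \le r \le d, \]

where each $\Gamma_l^r$ is a $k_l \times k_l$ self-adjoint matrix corresponding to a distinct joint
eigenvalue of $S$ with multiplicity $k_l$. It follows from Theorem 2.3 that

\[ [\tilde{\Gamma}^r, \tilde{\Gamma}^s] = 0, \quad \text{which implies} \quad [\Gamma_l^r, \Gamma_l^s] = 0 \qquad \forall\, 1 \le r, s \le d \text{ and } 1 \le l \le p. \]

Thus, for each $l$ there is a $k_l \times k_l$ unitary matrix $V_l$ such that $V_l$ diagonalizes each
$\Gamma_l^r$. Let $V$ be the $n \times n$ block diagonal matrix with blocks given by $V_1, \dots, V_p$. Then,
$V$ is a unitary matrix that diagonalizes each $\tilde{\Gamma}^r$. By the diagonalization in (2.2), the
joint eigenvalues of $D$ are positioned so that

\[ D^r = \begin{pmatrix} c_1^r I_{k_1} & & \\ & \ddots & \\ & & c_p^r I_{k_p} \end{pmatrix} \qquad \forall\, 1 \le r \le d, \tag{3.11} \]

where $I_{k_l}$ is the $k_l \times k_l$ identity matrix and $c_l^r$ is a constant. Equation (3.11) shows
that conjugation by $V$ will not affect $D^r$. Define the diagonal matrix

\[ \Lambda^r := V^* \tilde{\Gamma}^r V \qquad \forall\, 1 \le r \le d, \]

and rewrite $T(t)$ as follows:

\[ T^r(t) = U e^{Yt} V \big[ D^r + t \Lambda^r \big] V^* e^{-Yt} U^* \qquad \forall\, 1 \le r \le d. \]

Now we directly calculate $F(T(t))$ and $\frac{d}{dt} F(T(t))\big|_{t=0}$:

\[
\begin{aligned}
F(T(t)) &= U e^{Yt} V\, F\big( D^1 + t \Lambda^1, \dots, D^d + t \Lambda^d \big)\, V^* e^{-Yt} U^* \\
&= U e^{Yt} V \Big( F(D) + t \sum_{r=1}^{d} \Lambda^r \frac{\partial F}{\partial x^r}(D) + o(|t|) \Big) V^* e^{-Yt} U^*,
\end{aligned}
\]

where $\frac{\partial F}{\partial x^r}(D)$ is defined by

\[ \frac{\partial F}{\partial x^r}(D) := \begin{pmatrix} \frac{\partial f}{\partial x^r}(x_1) & & \\ & \ddots & \\ & & \frac{\partial f}{\partial x^r}(x_n) \end{pmatrix} \qquad \forall\, 1 \le r \le d, \]

and the first-order approximation of $F$ follows from the approximation of $f$ on each
diagonal entry of the $d$-tuple of diagonal matrices. Differentiating $F(T(t))$ and setting
$t = 0$ gives

\[
\begin{aligned}
\frac{d}{dt} F(T(t))\Big|_{t=0} &= U \Big( \sum_{r=1}^{d} V \Lambda^r \frac{\partial F}{\partial x^r}(D) V^* + [Y, V F(D) V^*] \Big) U^* \\
&= U \Big( \sum_{r=1}^{d} \tilde{\Gamma}^r \frac{\partial F}{\partial x^r}(D) + [Y, F(D)] \Big) U^*,
\end{aligned}
\]

where conjugation by $V$ leaves $F(D)$ and each $\frac{\partial F}{\partial x^r}(D)$ unchanged because those
matrices have decompositions akin to that of $D^r$ in (3.11). □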
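
The derivative formula of Proposition 3.5 can be compared with a finite difference on a small example. The sketch below makes several assumptions for illustration: $U = I$, $d = n = 2$, joint eigenvalues $(0,0)$ and $(1,2)$ at $t^* = 0$, the hand-picked directions `Gamma`, the curve $e^{Yt}(D^r + t\tilde{\Gamma}^r)e^{-Yt}$ with $Y = \begin{pmatrix} 0 & 1 \\ -1 & 0\end{pmatrix}$, and the polynomial $f(x,y) = x^2 y + y$, for which $F(S^1, S^2) = (S^1)^2 S^2 + S^2$ can be computed by matrix products.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def rotation(t):
    """exp(Y t) for Y = [[0, 1], [-1, 0]]."""
    return [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

joint_eigs = [(0.0, 0.0), (1.0, 2.0)]                    # hypothetical sigma(S(0))
D = [[[0.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 2.0]]]
Gamma = [[[0.5, 1.0], [1.0, -0.3]], [[0.2, 2.0], [2.0, 0.7]]]
Gamma_tilde = [[[G[i][j] if i == j else 0.0 for j in range(2)] for i in range(2)] for G in Gamma]

def curve(r, t):
    """S^r(t) = exp(Yt) (D^r + t Gamma_tilde^r) exp(-Yt), a curve of commuting pairs."""
    middle = [[D[r][i][j] + t * Gamma_tilde[r][i][j] for j in range(2)] for i in range(2)]
    return matmul(matmul(rotation(t), middle), rotation(-t))

def f(x, y):
    return x * x * y + y

def grad_f(x, y):
    return (2 * x * y, x * x + 1)

def F_on_curve(t):
    """F(S(t)) for the polynomial f, computed as S^1 S^1 S^2 + S^2."""
    A, B = curve(0, t), curve(1, t)
    AAB = matmul(matmul(A, A), B)
    return [[AAB[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def formula():
    """Entries of sum_r Gamma_tilde^r dF/dx^r(D) + [Y, F(D)] (here U = I)."""
    M = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            xi, xj = joint_eigs[i], joint_eigs[j]
            if xi == xj:
                g = grad_f(*xi)
                M[i][j] = sum(Gamma[r][i][j] * g[r] for r in range(2))
            else:
                q = 0 if xi[0] != xj[0] else 1
                M[i][j] = Gamma[q][i][j] * (f(*xi) - f(*xj)) / (xi[q] - xj[q])
    return M

def finite_difference(h=1e-5):
    plus, minus = F_on_curve(h), F_on_curve(-h)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

predicted, measured = formula(), finite_difference()
gap = max(abs(predicted[i][j] - measured[i][j]) for i in range(2) for j in range(2))
```

The off-diagonal entries of `predicted` are divided differences of $f$ weighted by $\Gamma^q_{ij}$, while the diagonal entries are directional derivatives; `gap` should be at the level of the finite-difference error.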

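We now prove that the derivative calculated in Proposition 3.5 is continuous in $t^*$.

Theorem 3.6 Let $S(t)$ be a $C^1$ curve in $CS_n^d$ defined on an interval $I$. Let $\Omega$ be an
open set in $\mathbb{R}^d$ with $\sigma(S(t)) \subset \Omega$. If $f \in C^1(\Omega, \mathbb{R})$, then

\[ \frac{d}{dt} F(S(t))\Big|_{t=t^*} \ \text{is continuous as a function of } t^* \text{ on } I. \]

For the proof, we will require the following lemma:

Lemma 3.7 Let $S(t)$ be a $C^1$ curve in $CS_n^d$ defined on an interval $I$. Let $\Omega$ be an
open, convex set in $\mathbb{R}^d$ with $\sigma(S(t)) \subset \Omega$. If $f \in C^1(\Omega, \mathbb{R})$ and $t_0 \in I$, then there is a
neighborhood $I_0$ around $t_0$ such that

\[ \Big\| \frac{d}{dt} F(S(t))\Big|_{t=t^*} \Big\| \le C \max_{\substack{1 \le s \le d \\ x \in E}} \Big| \frac{\partial f}{\partial x^s}(x) \Big| \qquad \text{for all } t^* \in I_0, \]

where $C$ is a constant and $E$ is a convex, precompact open set with $\overline{E} \subset \Omega$.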
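Proof: Let $t_0 \in I$ and fix a bounded interval $I_0$ around $t_0$ with $\overline{I_0} \subset I$. By Theorem
2.5, the joint eigenvalues of $S(t^*)$ are continuous on $I_0$. Thus, there exists an open,
precompact, convex set $E \subset \mathbb{R}^d$ such that $\overline{E} \subset \Omega$ and $\sigma(S(t^*)) \subset E$ for each $t^* \in I_0$.
Fix $t^* \in I_0$. By Proposition 3.5,

\[ \Big\| \frac{d}{dt} F(S(t))\Big|_{t=t^*} \Big\| = \Big\| U \Big( \sum_{r=1}^{d} \tilde{\Gamma}^r \frac{\partial F}{\partial x^r}(D) + [Y, F(D)] \Big) U^* \Big\|, \tag{3.12} \]

where $U$, $D^r$, $\Gamma^r$, and $Y$ are functions of $t^*$ defined in Proposition 3.5, and the joint
eigenvalues of $S(t^*)$ are denoted by $x_i$, for $1 \le i \le n$. Observe that the matrix in
(3.12) can be rewritten as

\[ \Big( \sum_{r=1}^{d} \tilde{\Gamma}^r \frac{\partial F}{\partial x^r}(D) + [Y, F(D)] \Big)_{ij} = \begin{cases} \displaystyle\sum_{r=1}^{d} \Gamma^r_{ij} \frac{\partial f}{\partial x^r}(x_i) & \text{if } x_i = x_j \\[4mm] \Gamma^q_{ij}\, \dfrac{f(x_i) - f(x_j)}{x_i^q - x_j^q} & \text{if } x_i \neq x_j, \end{cases} \tag{3.13} \]

where $q$ is such that $x_i^q \neq x_j^q$. As shown in the proof of Theorem 2.3, the value
$\frac{\Gamma^q_{ij}}{x_i^q - x_j^q}$ is independent of $q$ whenever $x_i^q \neq x_j^q$.

Recall that for a given $n \times n$ self-adjoint matrix $A$ and an $n \times n$ unitary matrix $U$,

\[ \max_{ij} |(U A U^*)_{ij}| \le n \|U A U^*\| = n \|A\| \le n^2 \max_{ij} |A_{ij}|. \tag{3.14} \]

It is immediate from (3.12), (3.13) and (3.14) that

\[ \Big\| \frac{d}{dt} F(S(t))\Big|_{t=t^*} \Big\| \le n \max_{ij} \Big| \sum_{r=1}^{d} \Gamma^r_{ij} \frac{\partial f}{\partial x^r}(x_i) \Big| + n \max_{ij} \Big| \Gamma^q_{ij}\, \frac{f(x_i) - f(x_j)}{x_i^q - x_j^q} \Big|, \tag{3.15} \]

where the first maximum is taken over $(i,j)$ with $x_i = x_j$, the second maximum is
taken over $(i,j)$ with $x_i \neq x_j$, and $q$ is such that $x_i^q \neq x_j^q$. Fix $(i,j)$ with $x_i \neq x_j$.
Since $f \in C^1(\Omega)$, we can apply the multivariate Mean Value Theorem as follows:

\[ |f(x_i) - f(x_j)| = |\nabla f(x^*) \cdot (x_i - x_j)| \le \sum_{r=1}^{d} \Big| \frac{\partial f}{\partial x^r}(x^*) \Big| \, |x_i^r - x_j^r|, \tag{3.16} \]

where $x^*$ is on the line in $E$ connecting $x_i$ and $x_j$. If $x_i^q \neq x_j^q$, then for each $r$ with $x_i^r \neq x_j^r$,

\[ \frac{\Gamma^q_{ij}}{x_i^q - x_j^q} = \frac{\Gamma^r_{ij}}{x_i^r - x_j^r}. \]

It follows from (3.16) that, for each $(i,j,q)$ with $x_i^q \neq x_j^q$,

\[
\begin{aligned}
\Big| \Gamma^q_{ij}\, \frac{f(x_i) - f(x_j)}{x_i^q - x_j^q} \Big| &\le \sum_{r=1}^{d} \Big| \frac{\partial f}{\partial x^r}(x^*) \Big| \, \Big| \Gamma^q_{ij}\, \frac{x_i^r - x_j^r}{x_i^q - x_j^q} \Big| \\
&\le \max_{s;\, x \in E} \Big| \frac{\partial f}{\partial x^s}(x) \Big| \sum_{r=1}^{d} |\Gamma^r_{ij}| \\
&\le d n^2 \max_{s;\, x \in E} \Big| \frac{\partial f}{\partial x^s}(x) \Big| \max_{i,j,r} \big| (S^r)'(t^*)_{ij} \big|.
\end{aligned}
\tag{3.17}
\]

Likewise,

\[ \Big| \sum_{r=1}^{d} \Gamma^r_{ij} \frac{\partial f}{\partial x^r}(x_i) \Big| \le d n^2 \max_{s;\, x \in E} \Big| \frac{\partial f}{\partial x^s}(x) \Big| \max_{i,j,r} \big| (S^r)'(t^*)_{ij} \big|. \tag{3.18} \]

Let $M$ be a constant bounding each $|(S^r)'(t^*)_{ij}|$ on $I_0$ and let $C = 2dn^3 M$. Substituting
(3.17) and (3.18) into (3.15) gives

\[
\begin{aligned}
\Big\| \frac{d}{dt} F(S(t))\Big|_{t=t^*} \Big\| &\le 2 d n^3 \max_{s;\, x \in E} \Big| \frac{\partial f}{\partial x^s}(x) \Big| \max_{i,j,r} \big| (S^r)'(t^*)_{ij} \big| \\
&\le C \max_{s;\, x \in E} \Big| \frac{\partial f}{\partial x^s}(x) \Big| \qquad \forall\, t^* \in I_0. \qquad \Box
\end{aligned}
\]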



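Proof of Theorem 3.6:

First assume $\Omega$ is convex. Let $t_0 \in I$. Let $I_0$ be the interval around $t_0$ and $E$ be the
convex, precompact open set given in Lemma 3.7. Since $f$ is a $C^1$ function and $\overline{E}$ is
compact, a generalization of the Stone-Weierstrass Theorem in [5, pg 55] guarantees
a sequence $\{\phi_k\}$ of functions analytic on $\mathbb{R}^d$ such that

\[ |\phi_k(x) - f(x)| < \frac{1}{k} \quad \text{and} \quad \Big| \frac{\partial \phi_k}{\partial x^r}(x) - \frac{\partial f}{\partial x^r}(x) \Big| < \frac{1}{k} \qquad \forall\, x \in E \text{ and } 1 \le r \le d. \]

Lemma 3.7 guarantees that, for each $t^* \in I_0$,

\[
\begin{aligned}
\Big\| \frac{d}{dt} \Phi_k(S(t))\Big|_{t=t^*} - \frac{d}{dt} F(S(t))\Big|_{t=t^*} \Big\| &= \Big\| \frac{d}{dt} (\Phi_k - F)(S(t))\Big|_{t=t^*} \Big\| \\
&\le C \max_{s;\, x \in E} \Big| \frac{\partial (\phi_k - f)}{\partial x^s}(x) \Big| \\
&\le \frac{C}{k},
\end{aligned}
\]

where $C$ is a fixed constant. This implies

\[ \Big\{ \frac{d}{dt} \Phi_k(S(t))\Big|_{t=t^*} \Big\} \ \text{converges uniformly to} \ \frac{d}{dt} F(S(t))\Big|_{t=t^*} \ \text{on } I_0. \]

By Proposition 3.2, each $\frac{d}{dt} \Phi_k(S(t))\big|_{t=t^*}$ is continuous on $I$. Since the uniform limit
of continuous functions is continuous, $\frac{d}{dt} F(S(t))\big|_{t=t^*}$ is continuous on $I_0$.

Now, let $\Omega$ be an arbitrary domain. Fix $t_0 \in I$ and let $I_0$ be a bounded open interval
around $t_0$ with $\overline{I_0} \subset I$. Let $E \subset \mathbb{R}^d$ be an open precompact set such that $\overline{E} \subset \Omega$ and
$\sigma(S(t^*)) \subset E$ for all $t^* \in I_0$. Let $O$ be an open set and $K$ be a compact set such that
$\overline{E} \subset O \subset K \subset \Omega$ and define a $C^\infty$ bump function $b(x)$ such that

\[ b(x) := \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{if } x \in K^c. \end{cases} \]

Now we can define a function $g$ in $C^1(\mathbb{R}^d, \mathbb{R})$ by

\[ g(x) := \begin{cases} b(x) f(x) & \text{if } x \in \Omega \\ 0 & \text{if } x \in \Omega^c. \end{cases} \]

As $\mathbb{R}^d$ is convex, it follows from the previous result that $\frac{d}{dt} G(S(t))\big|_{t=t^*}$ is continuous
on $I_0$. Since $f(x) = g(x)$ in $E$, it follows from the formula in Proposition 3.5 that

\[ \frac{d}{dt} F(S(t))\Big|_{t=t^*} = \frac{d}{dt} G(S(t))\Big|_{t=t^*} \qquad \forall\, t^* \in I_0, \]

and thus, $\frac{d}{dt} F(S(t))\big|_{t=t^*}$ is continuous on $I_0$. □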

Recall that $CS_n^d$ possesses a Whitney stratification with pieces $\{M_\alpha\}$. Let $\Omega$ be an
open set in $\mathbb{R}^d$ and let $f \in C^1(\Omega, \mathbb{R})$. Let $V$ be an open set in $CS_n^d$ such that for all
$S \in V$, $\sigma(S) \subset \Omega$. Define $TV := \cup\, T(M_\alpha \cap V)$. Then, $F(S)$ exists for all $S \in V$ and
we can use the derivative results to define a map $DF : TV \to TS_n$.

Specifically, fix an element in $TV$, which will consist of an $S \in V$ and $\Delta \in T_S M_\alpha$,
where $M_\alpha$ is the piece containing $S$. Let $S(t)$ be a $C^1$ curve in $CS_n^d$ such that
$S(0) = S$ and $S'(0) = \Delta$. Define

\[ DF(S, \Delta) := \frac{d}{dt} F(S(t))\Big|_{t=0} = U \Big( \sum_{r=1}^{d} \tilde{\Gamma}^r \frac{\partial F}{\partial x^r}(D) + [Y, F(D)] \Big) U^*, \]

where $U$, $D$, $\Gamma^r$, and $Y$ are defined using $S$ and $\Delta$ as in Proposition 3.5. It is easy to
see that the map is well-defined and $DF(S, \cdot)$ is linear in $\Delta$. In the following theorem,
let $S$ be in a piece $M_\alpha$ and let $R$ be in a piece $M_\beta$ of a Whitney stratification of $CS_n^d$.

Theorem 3.8 Let $\Omega$ be an open set in $\mathbb{R}^d$ and $V$ be an open set in $CS_n^d$ with $\sigma(S)$
in $\Omega$ for all $S \in V$. If $f \in C^1(\Omega, \mathbb{R})$, then

\[ DF : TV \to TS_n \ \text{is continuous.} \]

Specifically, if $S \in V$ with $\Delta \in T_S M_\alpha$, then given $\epsilon > 0$, there exist $\delta_1, \delta_2 > 0$ such
that if $R \in V$ with $\Lambda \in T_R M_\beta$, $\|S - R\| < \delta_1$, and $\|\Delta - \Lambda\| < \delta_2$, then

\[ \| DF(S, \Delta) - DF(R, \Lambda) \| < \epsilon. \]

Proof: The result for analytic functions follows from Equation (3.9). For an arbitrary
function $f$, and for $R$ and $\Lambda$ sufficiently close to $S$ and $\Delta$, bound $\|DF(R, \Lambda)\|$
in a manner similar to Lemma 3.7. The remainder of the proof is almost identical to
that of Theorem 3.6 and is left as an exercise. □
4 Higher Order Derivatives

We now consider higher-order differentiation and, for ease of notation, discuss only
two-variable functions. We first clarify some notation. In earlier sections, $(\zeta^1, \dots, \zeta^d)$
referred to a point in $\mathbb{C}^d$. In this section, $(\zeta_1, \zeta_2)$ denotes a point in $\mathbb{C}^2$. Previously,
$S(t)$ and $T(t)$ denoted two separate curves in $CS_n^d$. Now, $S(t)$ and $T(t)$ denote the
two components of a single curve in $CS_n^2$.

Let $(S(t), T(t))$ be a $C^m$ curve in $CS_n^2$ defined on an interval $I$. If $m \ge 1$, the curve
is Lipschitz. By Theorem 2.5, there are Lipschitz curves

\[ (x_s(t), y_s(t)) \qquad \text{for } 1 \le s \le n, \tag{4.19} \]

defined on $I$ representing the joint eigenvalues of $(S(t), T(t))$. Let $U(t)$ be a unitary
matrix diagonalizing $(S(t), T(t))$ so that the joint eigenvalues are ordered as in (4.19).
To simplify notation, we write $(S(t), T(t))$ as $(S, T)$. For $l \in \mathbb{N}$ with $1 \le l \le m$, define

\[ S^l := S^{(l)}(t) \quad \text{and} \quad T^l := T^{(l)}(t) \tag{4.20} \]

and the set of pairs of index tuples

\[ I_l := \big\{ (i_1, \dots, i_k) \cup (i_{k+1}, \dots, i_j) : i_1 + \cdots + i_j = l,\ i_q \in \mathbb{N} \text{ for } 1 \le q \le j \big\}. \]
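
The index set $I_l$ can be enumerated mechanically: each element is an ordered tuple of positive integers summing to $l$, split at some position into a left part (used for derivatives of $S$) and a right part (used for derivatives of $T$). The sketch below is one way to generate it; the function names `compositions` and `index_pairs` are assumptions introduced for illustration, and the empty tuple `()` plays the role of $\emptyset$.

```python
def compositions(total):
    """Yield all ordered tuples of positive integers summing to total."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            yield (first,) + rest

def index_pairs(l):
    """All pairs (i_1..i_k), (i_{k+1}..i_j) with i_1 + ... + i_j = l."""
    pairs = []
    for comp in compositions(l):
        for k in range(len(comp) + 1):
            pairs.append((comp[:k], comp[k:]))
    return pairs

pairs_for_two = index_pairs(2)
```

For $l = 2$ this produces five pairs, and the count grows quickly with $l$ (twelve pairs for $l = 3$).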

For example, $I_2 = \{(2) \cup \emptyset,\ (1,1) \cup \emptyset,\ (1) \cup (1),\ \emptyset \cup (1,1),\ \emptyset \cup (2)\}$. For notational
ease, define

\[ U := U(t), \quad x_s := x_s(t), \quad y_s := y_s(t) \qquad \text{for } 1 \le s \le n. \]

For some formulas, we will conjugate the derivatives in (4.20) by $U^*$ and so define

\[ \Gamma^l := U^* S^{(l)} U \quad \text{and} \quad \Delta^l := U^* T^{(l)} U, \qquad \text{for } 1 \le l \le m. \]

We will use the integral formula given in Lemma 3.4 and simplify it by defining

\[ R_1 := (\zeta_1 I - S)^{-1} \quad \text{and} \quad R_2 := (\zeta_2 I - T)^{-1}, \]

where $\zeta_1$ and $\zeta_2$ are in the resolvent sets of $S$ and $T$ respectively. Now, let $J_1$ and $J_2$
be open intervals in $\mathbb{R}$ and let $f$ be an element of $C^m(J_1 \times J_2, \mathbb{R})$. Fix $j$ and $k$ in
$\mathbb{N}$ such that $k \le j \le m$. Fix $k + 1$ points $x_1, \dots, x_{k+1}$ in $J_1$ and $j - k + 1$ points
$y_1, \dots, y_{j-k+1}$ in $J_2$. Then

\[ f^{[k,\, j-k]}(x_1, \dots, x_{k+1};\ y_1, \dots, y_{j-k+1}) \]

denotes the divided difference of $f$ taken in the first variable $k$ times and the second
variable $j - k$ times, evaluated at the given points. Finally, let $\odot$ denote the Schur
product of two matrices. We will prove the following differentiability result:
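
For pairwise distinct points, the two-variable divided difference $f^{[k,\,j-k]}$ can be computed by applying the usual one-variable recursion first in the second variable and then in the first. The sketch below assumes distinct points only (repeated points, which require derivatives of $f$, are not handled), and the function names are illustrative.

```python
def divided_difference(g, points):
    """One-variable divided difference g[p_0, ..., p_k] for pairwise distinct points."""
    if len(points) == 1:
        return g(points[0])
    upper = divided_difference(g, points[1:])
    lower = divided_difference(g, points[:-1])
    return (upper - lower) / (points[-1] - points[0])

def divided_difference_2d(f, xs, ys):
    """f^{[k, j-k]}(xs; ys) with k = len(xs) - 1 and j - k = len(ys) - 1."""
    def inner(x):
        # Divided difference in the second variable with x held fixed.
        return divided_difference(lambda y: f(x, y), ys)
    return divided_difference(inner, xs)

def sample(x, y):
    """Hypothetical test function f(x, y) = x^2 y."""
    return x * x * y

first_in_x = divided_difference_2d(sample, [1.0, 3.0], [2.0])        # (x_1 + x_2) y_1
mixed = divided_difference_2d(sample, [1.0, 3.0], [2.0, 5.0])        # x_1 + x_2
second_in_x = divided_difference_2d(sample, [1.0, 3.0, 4.0], [2.0])  # y_1
```

For $f(x,y) = x^2 y$ the closed forms in the comments follow from the one-variable identities $(x^2)[x_1, x_2] = x_1 + x_2$ and $(x^2)[x_1, x_2, x_3] = 1$.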
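Theorem 4.1 Let $J_1$ and $J_2$ be open intervals in $\mathbb{R}$ and let $f \in C^m(J_1 \times J_2, \mathbb{R})$.
Let $(S, T)$ be a $C^m$ curve in $CS_n^2$ defined on an interval $I$ with joint eigenvalues in
$J_1 \times J_2$. For $1 \le l \le m$ and $t^* \in I$, $\frac{d^l}{dt^l} F(S,T)\big|_{t=t^*}$ exists and

\[
\frac{d^l}{dt^l} F(S,T)\Big|_{t=t^*} = U \Bigg( \sum_{I_l}\ \sum_{s_2, \dots, s_j = 1}^{n} \frac{l!}{i_1! \cdots i_j!} \Big[ f^{[k,\, j-k]}(x_{s_1}, \dots, x_{s_{k+1}};\ y_{s_{k+1}}, \dots, y_{s_{j+1}}) \Big]_{s_1, s_{j+1} = 1}^{n}
\odot \Big[ \Gamma^{i_1}_{s_1 s_2} \cdots \Gamma^{i_k}_{s_k s_{k+1}} \Delta^{i_{k+1}}_{s_{k+1} s_{k+2}} \cdots \Delta^{i_j}_{s_j s_{j+1}} \Big]_{s_1, s_{j+1} = 1}^{n} \Bigg) U^*,
\]

where the $U$, $U^*$, $\Gamma^l$, $\Delta^j$, $x_q$ and $y_r$ are evaluated at $t^*$.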

Notice that the derivative formula in Theorem 4.1 requires $f$ to be defined on pairs
$(x_q, y_r)$ for $1 \le r, q \le n$, rather than just at the joint eigenvalues $(x_q, y_q)$ of $(S,T)$.
This condition was not needed in Theorem 3.1. Before proving Theorem 4.1, we
consider the case where $f$ is real-analytic and show:

Proposition 4.2 Let $J_1$ and $J_2$ be open intervals in $\mathbb{R}$ and let $f$ be real-analytic on
$J_1 \times J_2$. Fix $m \in \mathbb{N}$ and let $(S,T)$ be a $C^m$ curve in $CS_n^2$ defined on an interval $I$
with joint eigenvalues in $J_1 \times J_2$. Then $\frac{d^m}{dt^m} F(S,T)$ exists, has the form in Theorem
4.1, and $\frac{d^m}{dt^m} F(S,T)\big|_{t=t^*}$ is continuous as a function of $t^*$ on $I$.

The proof of Proposition 4.2 requires the following two technical lemmas: 

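Lemma 4.3 Let $(S, T)$ be a $C^m$ curve in $CS_n^2$ defined on an interval $I$. Let $t^* \in I$
and let $\zeta_1$ and $\zeta_2$ be elements in the resolvents of $S(t^*)$ and $T(t^*)$ respectively. Then

\[
\frac{d^l}{dt^l}(R_1 R_2)\Big|_{t=t^*}
= \sum_{\substack{1 \le j \le l,\ 0 \le k \le j \\ i_1 + \cdots + i_j = l,\ i_r \ge 1}}
\frac{l!}{i_1! \cdots i_j!} \,
R_1 S^{i_1} R_1 \cdots S^{i_k} R_1 R_2 T^{i_{k+1}} R_2 \cdots T^{i_j} R_2
\qquad \forall\, 1 \le l \le m,
\]
where each $R_1$, $R_2$, $S^i$, and $T^j$ is evaluated at $t^*$.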

Proof: The proof is a technical calculation using induction on $l$ and the formulas

\[ \frac{d}{dt} R_1 = R_1 S^1 R_1 \quad \text{and} \quad \frac{d}{dt} R_2 = R_2 T^1 R_2. \qquad \square \]
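
As a sanity check on the pattern in Lemma 4.3, the case $l = 2$ can be worked out by hand. This is only a sketch, assuming that the superscripts on $S$ and $T$ denote derivatives in $t$ (so $S^1 = S'$ and $S^2 = S''$) and using nothing beyond the product rule and the two first-order formulas above:

```latex
\begin{align*}
\frac{d^2}{dt^2} R_1
  &= \frac{d}{dt}\big( R_1 S^1 R_1 \big)
   = R_1 S^2 R_1 + 2\, R_1 S^1 R_1 S^1 R_1, \\
\frac{d^2}{dt^2} (R_1 R_2)
  &= \Big( \tfrac{d^2}{dt^2} R_1 \Big) R_2
   + 2 \Big( \tfrac{d}{dt} R_1 \Big) \Big( \tfrac{d}{dt} R_2 \Big)
   + R_1 \Big( \tfrac{d^2}{dt^2} R_2 \Big) \\
  &= R_1 S^2 R_1 R_2 + 2\, R_1 S^1 R_1 S^1 R_1 R_2
   + 2\, R_1 S^1 R_1 R_2 T^1 R_2 \\
  &\quad + R_1 R_2 T^2 R_2 + 2\, R_1 R_2 T^1 R_2 T^1 R_2 .
\end{align*}
```

Terms containing a single second derivative appear once, while terms containing two first derivatives appear twice.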

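Lemma 4.4 Let $J_1$ and $J_2$ be open intervals in $\mathbb{R}$ and let $f$ be real-analytic on
$J_1 \times J_2$. Let $j \ge k \in \mathbb{N}$. Choose $k+1$ points $x_1, \dots, x_{k+1} \in J_1$ and $j-k+1$ points
$y_1, \dots, y_{j-k+1} \in J_2$. Extend $f$ to be analytic on a complex rectangle $\Omega \subset \mathbb{C}^2$ such that
each $(x_q, y_r) \in \Omega$. Then $f^{[k,j-k]}(x_1, \dots, x_{k+1}; y_1, \dots, y_{j-k+1})$ exists and

\[
f^{[k,j-k]}(x_1, \dots, x_{k+1}; y_1, \dots, y_{j-k+1})
= \frac{1}{(2\pi i)^2} \int_{C_1} \int_{C_2}
\frac{f(\zeta_1, \zeta_2)}{\prod_{q=1}^{k+1} (\zeta_1 - x_q) \, \prod_{r=1}^{j-k+1} (\zeta_2 - y_r)}
\, d\zeta_2 \, d\zeta_1,
\]

where $C_1$ and $C_2$ are rectifiable curves strictly enclosing $x_1, \dots, x_{k+1}$ and $y_1, \dots, y_{j-k+1}$
respectively, such that $C_1 \times C_2 \subset \Omega$.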
Proof: For a one-variable function, the formula is proven in [4, pg 2] and the
two-variable analogue follows easily from the one-variable case. □
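
The contour-integral representation of divided differences can be checked numerically. The sketch below assumes the standard one-variable formula $g[x_1, \dots, x_{k+1}] = \frac{1}{2\pi i} \oint g(\zeta) \prod_q (\zeta - x_q)^{-1} \, d\zeta$, applied once in each variable. The test function $f(z_1, z_2) = e^{z_1 + 2 z_2}$, the circular contours, and the trapezoid rule are illustrative choices rather than anything taken from the paper, and the name `contour_divided_difference` is hypothetical.

```python
import cmath
import math


def contour_divided_difference(f, xs, ys, radius=2.0, n=64):
    """Approximate (1 / (2*pi*i))^2 times the double contour integral of
    f(z1, z2) / (prod_q (z1 - x_q) * prod_r (z2 - y_r))
    over the circles |z1| = |z2| = radius, using n trapezoid nodes per circle.
    All points in xs and ys must lie strictly inside the circle.
    """
    nodes = [radius * cmath.exp(2j * math.pi * m / n) for m in range(n)]
    total = 0j
    for z1 in nodes:
        denom1 = 1 + 0j
        for x in xs:
            denom1 *= (z1 - x)
        for z2 in nodes:
            denom2 = 1 + 0j
            for y in ys:
                denom2 *= (z2 - y)
            # On a circle dz = i * z * dtheta, so each factor 1 / (2*pi*i)
            # turns the integral into an average of (integrand * z).
            total += f(z1, z2) * z1 * z2 / (denom1 * denom2)
    return total / (n * n)


# f(x, y) = exp(x) * exp(2y): the divided difference f^{[1,1]}(0, 1; -1/2, 1/2)
# factors as (e - 1) * (e - 1/e).
exact = (math.e - 1.0) * (math.e - 1.0 / math.e)
approx = contour_divided_difference(
    lambda z1, z2: cmath.exp(z1 + 2 * z2), [0.0, 1.0], [-0.5, 0.5]
)
```

Because the integrand is analytic near the contours, the trapezoid rule converges geometrically, so a modest number of nodes should already agree with the closed form to high accuracy.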



Proof of Proposition 4.2: 

Use the integral formula in Lemma 3.4 to establish an integral formula for $\frac{d^m}{dt^m} F(S,T)$
similar to the first line of (3.9). Simplify the formula using Lemma 4.3. This
formula implies that the derivative is continuous. Then, use Lemma 4.4 to convert the
derivative into a formula involving the divided differences of $f$. The details are left
as an exercise. □

Proof of Theorem 4.1: 

The result follows via induction on $l$, and the base case is covered by Theorem 3.1. For
the inductive step, fix $t^* \in I$. Let $p$ be a polynomial such that $p$ and its derivatives to
$l$th order agree with $f$ at the points $(x_q(t^*), y_r(t^*))$ for $1 \le q, r \le n$. Find a constant
$C$ such that for $t$ near $t^*$,

\[
\Big\| \frac{d^{l-1}}{dt^{l-1}} F(S,T) - \frac{d^{l-1}}{dt^{l-1}} P(S,T) \Big\|
\le C \max \big| (f - p)^{[k,j-k]}(x_{s_1}, \dots, x_{s_{k+1}}; y_{s_{k+1}}, \dots, y_{s_{j+1}}) \big|,
\]

where the joint eigenvalues of $(S,T)$ are given by $(x_q, y_q)$ and the maximum is over
$(k, j)$ with $k \le j < l \in \mathbb{N}$ and sets $\{(s_1, \dots, s_{k+1}) \cup (s_{k+1}, \dots, s_{j+1}) : 1 \le s_1, \dots, s_{j+1} \le n\}$.
As in Theorem 3.1, apply the multivariate Mean Value Theorem to each $(f - p)^{[k,j-k]}$
and use the Lipschitz property of the eigenvalues to conclude

\[ \frac{d^l}{dt^l} F(S,T)\Big|_{t=t^*} \text{ exists and equals } \frac{d^l}{dt^l} P(S,T)\Big|_{t=t^*}. \]

The details are left as an exercise. □



We now show that the formula in Theorem 4.1 is continuous. 

Theorem 4.5 Let $J_1$ and $J_2$ be open intervals in $\mathbb{R}$ and $f \in C^m(J_1 \times J_2, \mathbb{R})$. Let
$(S, T)$ be a $C^m$ curve in $CS_n^2$ defined on an interval $I$ with joint eigenvalues in $J_1 \times J_2$.
Then for all $l \in \mathbb{N}$ with $1 \le l \le m$,

\[ \frac{d^l}{dt^l} F(S,T)\Big|_{t=t^*} \text{ is continuous as a function of } t^* \text{ on } I. \]

For the proof, we require the following lemma. The result is well-known for
one-variable functions, and Brown and Vasudeva prove this two-variable analogue in [3]:

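Lemma 4.6 Let $J_1$ and $J_2$ be open intervals in $\mathbb{R}$ and let $f \in C^m(J_1 \times J_2, \mathbb{R})$. Choose
$j, k \in \mathbb{N}$ with $k \le j \le m$. Let $x_1, \dots, x_{k+1} \in J_1$ and $y_1, \dots, y_{j-k+1} \in J_2$ and choose
closed subintervals $\tilde{J}_1 \subset J_1$ and $\tilde{J}_2 \subset J_2$ containing the $x$ and $y$ points respectively. Then, there
exists $(x^*, y^*) \in \tilde{J}_1 \times \tilde{J}_2$ with

\[
f^{[k,j-k]}(x_1, \dots, x_{k+1}; y_1, \dots, y_{j-k+1}) = \frac{f^{(k,j-k)}(x^*, y^*)}{k! \, (j-k)!}.
\]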
Proof of Theorem 4.5:

For $l < m$, the result follows from Theorem 4.1, which implies that $\frac{d^l}{dt^l} F(S,T)$ is
differentiable and, hence, continuous.

For $l = m$, fix $t_0 \in I$. As in Lemma 3.7, find a constant $C$ and an open precompact
convex set $J$ with $J \subset J_1 \times J_2$ such that, for all $g \in C^m(J_1 \times J_2, \mathbb{R})$ and $t^*$ near $t_0$,

\[
\Big\| \frac{d^m}{dt^m} G(S,T)\Big|_{t=t^*} \Big\|
\le C \max \big\{ |g^{(k,j-k)}(x,y)| : (x,y) \in J \big\},
\]

where $0 \le k \le j \le m$. The estimates for this bound require Lemma 4.6. Then,
approximate $f$ to $m$th order uniformly on $J$ by analytic functions $\{\phi_r\}$ and show

\[
\Big\{ \frac{d^m}{dt^m} \phi_r(S,T)\Big|_{t=t^*} \Big\} \text{ converges uniformly to } \frac{d^m}{dt^m} F(S,T)\Big|_{t=t^*}
\]

in a neighborhood of $t_0$. The result then follows from Proposition 4.2. □

5 Applications 

The formulas in Proposition 3.5 and Theorem 4.1 can be used to analyze monotonicity
and convexity of matrix functions. A function $F : S_n \to S_n$ is matrix monotone if

\[ F(A) \ge F(B) \text{ whenever } A \ge B \quad \forall\, A, B \in S_n. \]

For $F$ continuously differentiable, an equivalent condition is

\[ \frac{d}{dt} F(S(t))\Big|_{t=t^*} \ge 0 \text{ whenever } S'(t^*) \ge 0, \quad \forall\, C^1 \ S(t) \in S_n. \tag{5.20} \]

The local monotonicity condition in (5.20) extends to multivariate matrix functions:
the only adjustment is that $S(t)$ is in $CS_n^d$. In [1], Agler, McCarthy, and Young
characterized such locally monotone matrix functions on $CS_n^d$ using a special case
of Theorem 3.1 and Proposition 3.5. Specifically, they had to assume that $S(t)$ had
distinct joint eigenvalues at each $t$. Our results in Section 3 extend the derivative
formula to general $C^1$ curves in $CS_n^d$ and show that the formula is continuous.

A matrix function $F : S_n \to S_n$ is matrix convex if

\[ F(\lambda A + (1 - \lambda) B) \le \lambda F(A) + (1 - \lambda) F(B) \quad \forall\, A, B \in S_n, \ \lambda \in [0, 1]. \tag{5.21} \]
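
The two definitions can be contrasted on a small example in the one-variable case. The sketch below uses the classical fact that $f(x) = x^2$ induces a matrix function that is matrix convex but not matrix monotone. The specific $2 \times 2$ matrices and the helper names (`matmul`, `combine`, `is_psd`, `convexity_gap`) are illustrative assumptions and do not come from the paper.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(c1, a, c2, b):
    """Entrywise linear combination c1*a + c2*b."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]


def is_psd(m, tol=1e-12):
    """A symmetric 2x2 matrix is positive semidefinite iff trace >= 0 and det >= 0."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace >= -tol and det >= -tol


def square(a):
    """Matrix function induced by f(x) = x^2."""
    return matmul(a, a)


A = [[2.0, 1.0], [1.0, 1.0]]
B = [[1.0, 0.0], [0.0, 0.0]]

# Monotonicity fails: A - B is PSD, but A^2 - B^2 is not.
a_geq_b = is_psd(combine(1.0, A, -1.0, B))
squares_ordered = is_psd(combine(1.0, square(A), -1.0, square(B)))


def convexity_gap(lam):
    """lam*A^2 + (1-lam)*B^2 - (lam*A + (1-lam)*B)^2, which should be PSD."""
    mix = combine(lam, A, 1.0 - lam, B)
    average = combine(lam, square(A), 1.0 - lam, square(B))
    return combine(1.0, average, -1.0, square(mix))


gaps_psd = all(is_psd(convexity_gap(lam)) for lam in (0.0, 0.25, 0.5, 0.75, 1.0))
```

Algebraically the convexity gap equals $\lambda (1 - \lambda)(A - B)^2$, which is positive semidefinite for every $\lambda \in [0, 1]$. The failure of monotonicity comes from the negative determinant of $A^2 - B^2$.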

This condition extends to multivariate matrix functions with an additional restriction
on the pairs $A, B$ in $CS_n^d$; we also require $\lambda A + (1 - \lambda) B \in CS_n^d$ for $\lambda \in [0, 1]$. Given
such $A, B$, define $S(t)$ on $[0, 1]$ by

\[ S_r(t) := t A_r + (1 - t) B_r \quad \forall\, 1 \le r \le d. \]
If $F$ is twice continuously differentiable, it can be shown that (5.21) is equivalent to

\[ \frac{d^2}{dt^2} F(S(t))\Big|_{t=t^*} \ge 0 \quad \forall \text{ such } S(t), \ t^* \in [0, 1]. \tag{5.22} \]

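Assume $F$ was defined using a real function $f$ as in (1.1). For $d = 2$, Theorem 4.1
tells us that, up to conjugation by a unitary matrix $U$,

\[
\Big[ \frac{d^2}{dt^2} F(S(t))\Big|_{t=t^*} \Big]_{ij}
= 2 \sum_{k=1}^{n} \Big( f^{[2,0]}(x_i, x_k, x_j; y_j) \, \Gamma_{ik} \Gamma_{kj}
+ f^{[1,1]}(x_i, x_k; y_k, y_j) \, \Gamma_{ik} \Delta_{kj}
+ f^{[0,2]}(x_i; y_i, y_k, y_j) \, \Delta_{ik} \Delta_{kj} \Big),
\]

where $\{(x_i, y_i) : 1 \le i \le n\}$ are the joint eigenvalues of $t^* A + (1 - t^*) B$ and

\[ \Gamma := U^* (A_1 - B_1) U \quad \text{and} \quad \Delta := U^* (A_2 - B_2) U. \]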

This formula can be simplified using the relationship between $\Gamma$ and $\Delta$ discussed in
Theorem 2.3. Specifically, we know

\[ (x_i - x_j) \Delta_{ij} = (y_i - y_j) \Gamma_{ij} \quad \forall\, 1 \le i, j \le n. \]

Thus, this formula gives a characterization of convex matrix functions on $CS_n^2$.
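
One way to see where the relation between the two conjugated derivatives comes from is to differentiate the commutation relation along the curve. The following is only a sketch, not the argument of Theorem 2.3. It assumes $\Gamma = U^* S'(t^*) U$ and $\Delta = U^* T'(t^*) U$, where $U$ simultaneously diagonalizes $S(t^*)$ and $T(t^*)$, and it introduces the hypothetical shorthand $D_x = \mathrm{diag}(x_1, \dots, x_n)$ and $D_y = \mathrm{diag}(y_1, \dots, y_n)$:

```latex
\begin{align*}
S(t) T(t) &= T(t) S(t)
  && \text{for all } t, \\
S' T + S T' &= T' S + T S'
  && \text{(differentiate in } t \text{)}, \\
\Gamma D_y + D_x \Delta &= \Delta D_x + D_y \Gamma
  && \text{(conjugate by } U \text{ at } t = t^* \text{)}, \\
y_j \Gamma_{ij} + x_i \Delta_{ij} &= x_j \Delta_{ij} + y_i \Gamma_{ij}
  && \text{(take the } (i,j) \text{ entry)}, \\
(x_i - x_j) \Delta_{ij} &= (y_i - y_j) \Gamma_{ij}.
\end{align*}
```

For the linear curve $S(t) = tA + (1-t)B$ considered here, the derivatives are the constant matrices $A_1 - B_1$ and $A_2 - B_2$.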